Automatic generation of visual scenarios for spoken corpora acquisition

Authors

  • Demetrio Aiello
  • Cristina Delogu
  • Renato De Mori
  • Andrea Di Carlo
  • Marina Nisi
  • Silvia Tummeacciu
Abstract

The paper describes a system, implemented in Java, that generates written and visual scenarios for collecting speech corpora in the framework of a Tourism Information System. Methods and experimental results are also presented for evaluating how well subjects understand the proposed scenarios. The corpus generated from visual scenarios appears to be much richer than the one generated from textual descriptions.

Similar resources

Robust Extraction of Subcategorization Data from Spoken Language

Subcategorization data has been crucial for various NLP tasks. Current methods for automatic subcategorization frame (SCF) acquisition usually proceed in two steps: first, generate all SCF cues from a corpus using a parser, and then filter out spurious SCF cues with statistical tests. Previous studies on SCF acquisition have worked mainly with written texts; spoken corpora have received little attention. Transcripts of ...

How Spoken Language Corpora Can Refine Current Speech Motor Training Methodologies

The growing availability of spoken language corpora presents new opportunities for enriching the methodologies of speech and language therapy. In this paper, we present a novel approach for constructing speech motor exercises, based on linguistic knowledge extracted from spoken language corpora. In our study with the Dutch Spoken Corpus, syllabic inventories were obtained by means of automatic ...

Automatic Extraction of Subcategorization Frames from Spoken Corpora

We built a system for automatically extracting subcategorization frames (SCFs) from corpora of spoken language. The acquisition system, based on the design proposed by Briscoe & Carroll (1997), consists of a statistical parser, an SCF extractor, an English lemmatizer, and an SCF evaluator. These four components are applied in sequence to retrieve SCFs associated with each verb predicate in the cor...

Automatic lexicon generation and dialogue modeling for spontaneous speech

This paper describes a novel framework for dialogue modeling based on a superword model, a superset of the word n-gram. Its notable advantage is that only transcribed text is needed to obtain the model; no word dictionary is required. The paper shows that expressions specific to dialogue speech are extracted automatically from the transcriptions of spoken dialogue corpora ...

Automatic generation of phonetic transcriptions for large speech corpora

We describe a method for the automatic production of phonetic transcriptions in large speech corpora. First, we focus on the application of different techniques for the generation of pronunciation variants. Then, we explain the application of a speech recognition system for selecting the acoustically best matching phonetic transcription. The system is evaluated on different test sets selected f...

Publication date: 1998